Skip to content

[DNM] Refactor rmsd.py#75

Open
hannahbaumann wants to merge 81 commits intomainfrom
rmsd_refactor
Open

[DNM] Refactor rmsd.py#75
hannahbaumann wants to merge 81 commits intomainfrom
rmsd_refactor

Conversation

@hannahbaumann
Copy link
Contributor

@hannahbaumann hannahbaumann commented Jan 29, 2026

Note: DNM since this will break analysis in openfe protocols

  • Renamed ligand_wander to ligand_COM_drift

  • ligand_rmsd and ligand_COM_drift are now more nested, storing information for multiple ligands if present

  • Expose selection indices for protein and ligands

  • Enable ligand RMSD and wander calculation for files containing more than one ligands (e.g. SepTop). Ligands can be selected based on a MDAnalysis selection string. This assumes ligands are in separate residues.

@hannahbaumann hannahbaumann linked an issue Feb 3, 2026 that may be closed by this pull request
@hannahbaumann hannahbaumann self-assigned this Feb 3, 2026
@hannahbaumann
Copy link
Contributor Author

pre-commit.ci autofix

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alignment transformation is a concern here, there's a few other smaller things.

) -> tuple[mda.core.groups.AtomGroup, list[mda.core.groups.AtomGroup]]:
protein = u.select_atoms(protein_selection)
lig_atoms = u.select_atoms(ligand_selection)
# split into individual ligands by residue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is dangerous, you can't guarantee that a ligand will always be a single residue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be probably safer to select by fragments - guessing bonds if necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this!

Comment on lines +35 to +36
ligand_selection: str = "resname UNK",
protein_selection: str = "protein and name CA",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine for now, but in the refactor just take in atomgroups since we can't guarantee residue / atom names going forward, especially with rosemary.

# - make the ligands not jump periodic images between frames
# - align the ligands to minimise its RMSD
for lig in ligands:
u.trajectory.add_transformations(NoJump(lig), Aligner(lig))
Copy link
Member

@IAlibay IAlibay Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would align the trajectory to minimize ligand 1 and then re-align to ligand 2. I'm not sure it's going to be done this corrrectly - would it not effectively mess up the RMSD for the first ligand?

return u


def twoD_RMSD(positions: np.ndarray, w: Optional[npt.NDArray]) -> list[float]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine, but please open an issue to switch this to using something better, like this: https://userguide.mdanalysis.org/1.1.1/examples/analysis/alignment_and_rms/pairwise_rmsd.html#Pairwise-RMSD-of-a-trajectory-to-itself

ligand_com_drift : list of list[float] or None
COM drift of each ligand per frame.
"""
traj_slice = u.trajectory[::skip]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably dangerous to do in terms of the MDAnalysis internals (it's not a pure Python object). It is better if you enumerate over u.trajectory[::skip] instead on line 183.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this!

"protein_RMSD": [],
"ligand_RMSD": [],
"ligand_wander": [],
"ligand_COM_drift": [],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note - this will be a breaking change downstream, please don't put this in the next release.

Base automatically changed from fix_rmsd_multichain to main February 16, 2026 15:57
@codecov
Copy link

codecov bot commented Feb 16, 2026

Codecov Report

❌ Patch coverage is 92.85714% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.94%. Comparing base (5260e3b) to head (01e3d15).
⚠️ Report is 10 commits behind head on main.

Files with missing lines Patch % Lines
src/openfe_analysis/rmsd.py 92.85% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #75      +/-   ##
==========================================
+ Coverage   88.16%   95.94%   +7.77%     
==========================================
  Files           7        6       -1     
  Lines         338      345       +7     
==========================================
+ Hits          298      331      +33     
+ Misses         40       14      -26     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

User defined host alignment selection

2 participants